Implementing LNS using filtering units of GPUs, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010. ,
DOI : 10.1109/ICASSP.2010.5495516
URL : https://hal.archives-ouvertes.fr/hal-00423434
Graphic processors to speedup simulations for the design of high performance solar receptors, ASAP, pp.377-382, 2007. ,
État de l'intégration de la virgule flottante dans les processeurs graphiques, Revue des sciences et technologies de l'information, pp.719-733, 2008. ,
Line-by-line spectroscopic simulations on graphics processing units, Computer Physics Communications, vol.178, issue.2, pp.135-143, 2008. ,
DOI : 10.1016/j.cpc.2007.08.013
Chapter 9 - Interval arithmetic in CUDA, GPU Computing Gems Jade Edition, pp.99-107, 2012. ,
Fonctions élémentaires sur gpu exploitant la localité de valeurs, SYMPosium en Architectures nouvelles de machines (SYMPA), pp.1-11, 2008. ,
Étude comparée et simulation d'algorithmes de branchements pour le gpgpu, SYMPosium en Architectures nouvelles de machines (SYMPA), 2009. ,
Barra: A Parallel Functional Simulator for GPGPU, 2010 IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pp.351-360, 2010. ,
DOI : 10.1109/MASCOTS.2010.43
Full-speed deterministic bit-accurate parallel floating-point summation, 2014. ,
Barra, a Modular Functional GPU Simulator for GPGPU, 2009. ,
Power Consumption of GPUs from a Software Perspective, Lecture Notes in Computer Science, vol.5544, pp.922-931, 2009. ,
Dynamic detection of uniform and affine vectors in gpgpu computations, Europar 3rd Workshop on Highly Parallel Processing on a Chip (HPPC), 2009. ,
A gpu interval library based on boost interval, Real Numbers and Computers, pp.61-72, 2008. ,
Implementation of float-float operators on graphics hardware, RNC7, pp.23-32, 2006. ,
URL : https://hal.archives-ouvertes.fr/hal-00021443
Software Carry-Save: A Case Study for Instruction-Level Parallelism, PaCT, pp.207-214, 2003. ,
DOI : 10.1007/978-3-540-45145-7_18
Collapsing dependent floating point operations, IMACS World Congress Scientific Computation, Applied Mathematics and Simulation, pp.1-10, 2005. ,
Prédictibilité des ordonnanceurs des gpu, SYMPosium en Architectures nouvelles de machines (SYMPA), pp.1-10, 2014. ,
Implémentation de l'opérateur add2, Research Report, vol.3, 2004. ,
Real-time simulation of power networks using multi-core architecture, DERBI, 2012. ,
FuzzyGPU: A Fuzzy Arithmetic Library for GPU, 2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, 2013. ,
DOI : 10.1109/PDP.2014.16
URL : https://hal.archives-ouvertes.fr/hal-00856617
Regularity Versus Load-balancing on GPU for Treefix Computations, 2013 International Conference on Computational Science, pp.309-318, 2013. ,
DOI : 10.1016/j.procs.2013.05.194
URL : https://hal.archives-ouvertes.fr/hal-00768293
Températures, erreurs matérielles et gpu, SYMPosium en Architectures nouvelles de machines (SYMPA), pp.1-10, 2013. ,
GPUburn: A system to test and mitigate GPU hardware failures, 2013 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS), pp.263-270, 2013. ,
DOI : 10.1109/SAMOS.2013.6621133
URL : https://hal.archives-ouvertes.fr/hal-00827588
The instruction register file micro-architecture, pp.767-773, 2005. ,
DOI : 10.1016/j.future.2004.05.017
URL : https://hal.archives-ouvertes.fr/lirmm-01206362
Ordonnancement dynamique distribué, SympA, 2005. ,
Ordonnancement distribué d'instructions, Technique et Science Informatiques, 2006. ,
Cuda et les formats de représentation des nombres flottants, HPC Magazine, issue.4, pp.52-57, 2013. ,
Multipath execution : Opportunities and limits, International Conference on Supercomputing, pp.101-108, 1998. ,
Fuzzy Memoization for Floating-Point Multimedia Applications, IEEE Transactions on Computers, vol.54, issue.7, pp.922-927, 2005. ,
DOI : 10.1109/TC.2005.119
UNISIM: An Open Simulation Environment and Library for Complex Architecture Design and Collaborative Development, IEEE Computer Architecture Letters, vol.6, issue.2, pp.45-48, 2007. ,
DOI : 10.1109/L-CA.2007.12
Achieving Structural and Composable Modeling of Complex Systems, International Journal of Parallel Programming, vol.18, issue.6, pp.81-101, 2005. ,
DOI : 10.1007/s10766-005-3569-3
SimpleScalar: an infrastructure for computer system modeling, Computer, vol.35, issue.2, pp.59-67, 2002. ,
DOI : 10.1109/2.982917
A Fortran 90-based multiprecision system, ACM Transactions on Mathematical Software, vol.21, issue.4, pp.379-387, 1995. ,
DOI : 10.1145/212066.212075
Analyzing CUDA workloads using a detailed GPU simulator, 2009 IEEE International Symposium on Performance Analysis of Systems and Software, pp.163-174, 2009. ,
DOI : 10.1109/ISPASS.2009.4919648
Exploiting value locality in physical register files, 36th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-36), 2003. ,
DOI : 10.1109/MICRO.2003.1253201
Contribution à l'algorithmique parallèle, Le concept d'asynchronisme : étude théorique, mise en oeuvre et application, 1998. ,
Reducing the latency of division operations with partial caching, Conference Record of the Thirty-Sixth Asilomar Conference on Signals, Systems and Computers, pp.1598-1602, 2002. ,
A dynamic program analysis to find floating-point accuracy problems, Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '12, pp.453-462 ,
The m5 simulator : Modeling networked systems, IEEE Micro, vol.26, issue.4, pp.52-60, 2006. ,
Risk analysis of hazardous materials transportation: evaluating uncertainty by means of fuzzy logic, Journal of Hazardous Materials, vol.62, issue.1, pp.59-74, 1998. ,
DOI : 10.1016/S0304-3894(98)00158-7
The doubledouble library, 1998. ,
Accelerating correctly rounded floating-point division when the divisor is known in advance, IEEE Transactions on Computers, vol.53, issue.8, pp.1069-1072, 2004. ,
DOI : 10.1109/TC.2004.37
The design of the Boost interval arithmetic library, Theoretical Computer Science, vol.351, issue.1, pp.111-118, 2006. ,
DOI : 10.1016/j.tcs.2005.09.062
Wattch, ACM SIGARCH Computer Architecture News, vol.28, issue.2, pp.83-94, 2000. ,
DOI : 10.1145/342001.339657
GPUbench : evaluating gpu performance for numerical and scientific application, Proceedings of the ACM Workshop on General-Purpose Computing on Graphics Processors, 2004. ,
Brook for GPUs : Stream computing on graphics hardware, Proceedings of SIGGRAPH 2004, pp.777-786, 2004. ,
Use-based register caching with decoupled indexing, Proceedings of the 31st Annual International Symposium on Computer Architecture, pp.302-313, 2004. ,
Reducing branch costs via branch alignment, 6th International Conference on Architectural Support for Programming Languages and Operating Systems, pp.242-251, 1994. ,
True Random Number Generator Using GPUs and Histogram Equalization Techniques, 2011 IEEE International Conference on High Performance Computing and Communications, pp.161-170, 2011. ,
DOI : 10.1109/HPCC.2011.30
The genetic algorithm based tuning method for symmetric membership functions of fuzzy logic control systems, Industrial Automation and Control : Emerging Technologies International IEEE/IAS Conference on, pp.421-428, 1995. ,
A fuzzy approach for supplier evaluation and selection in supply chain management, International Journal of Production Economics, vol.102, issue.2, pp.289-301, 2006. ,
DOI : 10.1016/j.ijpe.2005.03.009
Instruction prefetching using branch prediction information, ICCD, pp.593-601, 1997. ,
Rule-base self-generation and simplification for data-driven fuzzy models, The 10th IEEE International Conference on Fuzzy Systems, pp.424-427, 2001. ,
Hardware memoization of mathematical and trigonometric functions, 2000. ,
Solving lattice QCD systems of equations using mixed precision solvers on GPUs, Computer Physics Communications, vol.181, issue.9, pp.1517-1528, 2010. ,
DOI : 10.1016/j.cpc.2010.05.002
Algorithm 665: Machar: a subroutine to dynamically determine machine parameters, ACM Transactions on Mathematical Software, vol.14, issue.4, pp.303-311, 1988. ,
DOI : 10.1145/50063.51907
Algorithm 714; CELEFUNT: a portable test package for complex elementary functions, ACM Transactions on Mathematical Software, vol.19, issue.1, pp.1-21, 1993. ,
DOI : 10.1145/151271.151272
Algorithm 715; SPECFUN---a portable FORTRAN package of special function routines and test drivers, ACM Transactions on Mathematical Software, vol.19, issue.1, pp.22-30, 1993. ,
DOI : 10.1145/151271.151273
A Proposed Radix- and Word-length-independent Standard for Floating-point Arithmetic, IEEE Micro, vol.4, issue.4, pp.86-100, 1984. ,
DOI : 10.1109/MM.1984.291224
System and method for managing divergent threads in a SIMD architecture, 2008. ,
Multiple-banked register file architectures, Proceedings of the 27th Annual International Symposium on Computer Architecture, pp.316-325, 2000. ,
Method and system for approximating sine and cosine functions, 2001. ,
Generating high-performance custom floating-point pipelines, 19th International Conference on Field Programmable Logic and Applications, pp.59-64, 2009. ,
Taming irregular eda applications on gpus, Proceedings of the 2009 International Conference on Computer-Aided Design, ICCAD '09, pp.539-546, 2009. ,
Translating GPU binaries to tiered SIMD architectures with Ocelot, 2009. ,
Operations on fuzzy numbers, International Journal of Systems Science, vol.12, issue.6, pp.613-626, 1978. ,
DOI : 10.1016/S0019-9958(65)90241-X
Efficient dynamic scheduling through tag elimination, Proceedings of the 29th Annual International Symposium on Computer Architecture, pp.37-46, 2002. ,
GNU multiple precision arithmetic library ,
Atom : A flexible interface for building high performance program analysis tools, Proceedings of the Winter 1995 USENIX Technical Conference on UNIX and Advanced Computing Systems, pp.303-314, 1995. ,
The Cg Tutorial : The Definitive Guide to Programmable Real-Time Graphics, 2003. ,
Trace Scheduling: A Technique for Global Microcode Compaction, IEEE Transactions on Computers, vol.30, issue.7, pp.478-490, 1981. ,
DOI : 10.1109/TC.1981.1675827
Toward defining the course of evolution : Minimum change for a specific tree topology, Syst Biol, vol.20, pp.406-416, 1971. ,
Efficient Ray Tracing Using Interval Analysis, Parallel Processing and Applied Mathematics, pp.1351-1360, 2008. ,
DOI : 10.1007/978-3-540-68111-3_143
Dynamic warp formation and scheduling for efficient gpu control flow, MICRO '07 : Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture, pp.407-420, 2007. ,
More on algorithms that reveal properties of floating point arithmetic units, Communications of the ACM, vol.17, issue.5, 1974. ,
Accelerating double precision fem simulations with GPUs, Proceedings of ASIM 2005 -18th Symposium on Simulation Technique, 2005. ,
Scalable Distributed Register File, Workshop on Complexity-effective Design held in conjunction with the 31st International Symposium on Computer Architecture, 2004. ,
Gaol 3.1.1 : Not just another interval arithmetic library, Laboratoire d'Informatique de Nantes-Atlantique, 2006. ,
Test en ligne pour la détection des fautes intermittentes dans les architectures multiprocesseurs embarquées. These, 2011. ,
Impact of the application activity on intermittent faults in embedded systems, 29th VLSI Test Symposium, pp.191-196 ,
DOI : 10.1109/VTS.2011.5783782
Increasing the instruction fetch rate via block-structured instruction set architectures, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29, pp.449-478, 1998. ,
DOI : 10.1109/MICRO.1996.566461
Hard data on soft errors : A large-scale assessment of real-world error rates in gpgpu, Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, CCGRID '10, pp.691-696, 2010. ,
Accelerating Large Graph Algorithms on the GPU Using CUDA, Proceedings of the 14th international conference on High performance computing, HiPC'07, pp.197-208, 2007. ,
DOI : 10.1007/978-3-540-77220-0_21
Parallel graph component labelling with GPUs and CUDA, Parallel Computing, vol.36, issue.12, pp.655-678, 2010. ,
DOI : 10.1016/j.parco.2010.07.002
Analysis of online self-testing policies for real-time embedded multiprocessors in dsm technologies, IOLTS, pp.49-55 ,
Algorithms for quad-double precision floating point arithmetic, Proceedings 15th IEEE Symposium on Computer Arithmetic. ARITH-15 2001, pp.155-162, 2001. ,
DOI : 10.1109/ARITH.2001.930115
The microarchitecture of the Pentium 4 processor, In Intel technology journal, p.1, 2001. ,
On implementing graph cuts on cuda, First Workshop on General Purpose Processing on Graphics Processing Units, 2007. ,
A Register File Architecture and Compilation Scheme for Clustered ILP Processors, Proceedings of the 8th International Euro-Par Conference on Parallel Processing, pp.500-511, 2002. ,
DOI : 10.1007/3-540-45706-2_68
Efficient conditional operations for data-parallel architectures, MICRO 33 : Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture, pp.159-170, 2000. ,
PARANOIA : a floating-point benchmark, Byte, vol.10, issue.2, pp.223-235, 1985. ,
Reducing register ports using delayed write-back queues and operand pre-fetch, Proceedings of the 17th annual international conference on Supercomputing , ICS '03, pp.172-182, 2003. ,
DOI : 10.1145/782814.782839
Dynamic floating-point cancellation detection, Parallel Comput, vol.39, issue.3, pp.146-155, 2013. ,
More instruction level parallelism explains the actual efficiency of compensated algorithms, 2007. ,
URL : https://hal.archives-ouvertes.fr/hal-00165020
Arrondi correct de fonctions mathématiques : fonctions univariées et bivariées, certification et automatisation, 2008. ,
A large, fast instruction window for tolerating cache misses, Proceedings of the 29th Annual International Symposium on Computer Architecture, pp.59-70, 2002. ,
Communication-efficient parallel algorithms for distributed random-access machines, Algorithmica, vol.11, issue.2, pp.53-77, 1988. ,
DOI : 10.1007/BF01762110
NVIDIA Tesla: A Unified Graphics and Computing Architecture, IEEE Micro, vol.28, issue.2, pp.39-55, 2008. ,
DOI : 10.1109/MM.2008.31
Simulating multiported memories using lower port count memories, 2008. ,
Value locality and load value prediction, SIGOPS Oper. Syst. Rev, vol.30, issue.5, pp.138-147, 1996. ,
Towards parallel programming models for predictability, 12th International Workshop on Worst-Case Execution Time Analysis, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, pp.48-58, 2012. ,
RAIDR, ISCA, pp.1-12, 2012. ,
DOI : 10.1145/2366231.2337161
Method for conditional branch execution in simd vector processors, US Patent 4,435,758, 1984. ,
An effective gpu implementation of breadth-first search, Proceedings of the 47th Design Automation Conference, DAC '10, pp.52-55, 2010. ,
Simics : A full system simulation platform, Computer, vol.35, issue.2, pp.50-58, 2002. ,
Characterizing the impact of predicated execution on branch prediction, Proceedings of the 27th Annual International Symposium on Microarchitecture, pp.217-227, 1994. ,
Multifacet's general execution-driven multiprocessor simulator (gems) toolset, 2005. ,
Scalable GPU graph traversal, ACM SIGPLAN Notices, vol.47, issue.8, pp.117-128, 2012. ,
DOI : 10.1145/2370036.2145832
Data flow prescheduling for large instruction windows in out-of-order processors, Proceedings of the 7th International Symposium on High-Performance Computer Architecture, pp.27-36, 2001. ,
Fuzzy modelling of power system optimal load flow, Power Industry Computer Application Conference Conference Proceedings, pp.386-392, 1991. ,
Introduction to interval analysis, Society for Industrial and Applied Mathematics, 2009. ,
DOI : 10.1137/1.9780898717716
Shader Performance Analysis on a Modern GPU Architecture, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05), pp.355-364, 2005. ,
DOI : 10.1109/MICRO.2005.30
The vector floating-point unit in a synergistic processor element of a cell processor, pp.59-67, 2005. ,
Parallel data processing systems and methods using cooperative thread arrays and simd instruction issue, 2009. ,
On division and reciprocal caches, 1995. ,
A high-performance area-efficient multifunction interpolator, Proceedings of the 17th IEEE Symposium on Computer Arithmetic (Cape Cod, USA), pp.272-279, 2005. ,
Complexity-effective superscalar processors, Proceedings of the 24th Annual International Symposium on Computer Architecture, pp.206-218, 1997. ,
Reducing register ports for higher speed and lower energy, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings., pp.171-182, 2002. ,
DOI : 10.1109/MICRO.2002.1176248
Number theoretic test generation for directed rounding, pp.241-248, 1999. ,
Dynamic flow instruction cache memory organized around trace segments independent of virtual address line. US Patent 5, 1992. ,
Microlib : A case for the quantitative comparison of micro-architecture mechanisms, MICRO 37 : Proceedings of ,
URL : https://hal.archives-ouvertes.fr/inria-00001110
Scheduling instructions from multi-thread instruction buffer based on phase boundary qualifying rule for phases of math and data access operations with better caching, 2008. ,
High-performance 3-1 interlock collapsing ALU's, IEEE Transactions on Computers, vol.43, issue.3, pp.257-268, 1994. ,
DOI : 10.1109/12.272427
Wrong-path instruction prefetching, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29, pp.165-175, 1996. ,
DOI : 10.1109/MICRO.1996.566459
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.49.7281
A computer architecture for the dynamic optimization of high-level language programs, 1980. ,
Algorithms for arbitrary precision floating point arithmetic, [1991] Proceedings 10th IEEE Symposium on Computer Arithmetic, pp.132-144, 1991. ,
DOI : 10.1109/ARITH.1991.145549
Direct Instruction Wakeup for Out-of-Order Processors, Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA'04), pp.2-9, 2004. ,
DOI : 10.1109/IWIA.2004.10002
A scalable front-end architecture for fast instruction delivery, Proceedings of the 26th Annual International Symposium on Computer Architecture (ISCA'99) of Computer Architecture News, pp.234-245, 1999. ,
Fetch directed instruction prefetching, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture, pp.16-27, 1999. ,
DOI : 10.1109/MICRO.1999.809439
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.35.8870
The mpfi library, 2001. ,
URL : https://hal.archives-ouvertes.fr/inria-00544998
Exploiting trivial and redundant computation, Proceedings of the 11th IEEE Symposium on Computer Arithmetic, pp.220-227, 1993. ,
Using the SimOS machine simulator to study complex computer systems, ACM Transactions on Modeling and Computer Simulation, vol.7, issue.1, pp.78-103, 1997. ,
DOI : 10.1145/244804.244807
Trace cache: a low latency approach to high bandwidth instruction fetching, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29, pp.24-34, 1996. ,
DOI : 10.1109/MICRO.1996.566447
Precimonious, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on, SC '13, pp.1-2712, 2013. ,
DOI : 10.1145/2503210.2503296
Fast and parallel interval arithmetic, Bit Numerical Mathematics, vol.39, issue.3, pp.534-554, 1999. ,
DOI : 10.1023/A:1022374804152
Resource scheduling under uncertainty in a smart grid with renewables and plug-in vehicles, IEEE Systems Journal, vol.6, issue.1, pp.103-109. ,
Minimal mutation trees of sequences, SIAM Journal on Applied Mathematics, 1975. ,
CADRE : Cycle-accurate deterministic replay for hardware debugging, DSN, pp.301-312, 2006. ,
Revisiting direct tag search algorithm on superscalar processors, Workshop on Complexity-effective Design held in conjunction with the 28th International Symposium on Computer Architecture, 2001. ,
The performance potential of data dependence speculation & collapsing, Proceedings of the 29th annual IEEE/ACM international symposium on Microarchitecture, pp.238-247, 1996. ,
DRAM errors in the wild, Communications of the ACM, vol.54, issue.2, pp.100-107, 2011. ,
DOI : 10.1145/1897816.1897844
A test of computer's floating-point arithmetic unit, 1981. ,
Scan primitives for gpu computing, Proceedings of the 22nd ACM SIGGRAPH/EUROGRAPHICS symposium on Graphics hardware, GH '07, pp.97-106, 2007. ,
SoftExplorer: Estimating and Optimizing the Power and Energy Consumption of a C Program for DSP Applications, EURASIP Journal on Advances in Signal Processing, vol.2005, issue.16, pp.2641-2654, 2005. ,
DOI : 10.1155/ASP.2005.2641
URL : https://hal.archives-ouvertes.fr/hal-00077302
Register write specialization register read specialization: a path to complexity-effective wide-issue superscalar processors, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings., pp.383-394, 2002. ,
DOI : 10.1109/MICRO.2002.1176265
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.2431
A flexible simulation framework for graphics architectures, Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware , HWWS '04, pp.85-94, 2004. ,
DOI : 10.1145/1058129.1058142
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.58.3303
Adaptive precision floating-point arithmetic and fast robust geometric predicates, Discrete and Computational Geometry, pp.305-363, 1997. ,
A compensation-based power flow method for weakly meshed distribution and transmission networks, IEEE Transactions on Power Systems, vol.3, issue.2, pp.753-762, 1988. ,
DOI : 10.1109/59.192932
A Proposed Standard for Binary Floating-Point Arithmetic, Computer, vol.14, issue.3, pp.51-62, 1981. ,
DOI : 10.1109/C-M.1981.220377
An American national standard : IEEE standard for binary floating point arithmetic, ACM SIGPLAN Notices, vol.22, issue.2, pp.9-25, 1987. ,
Virtual 16 bit precise operations on rgba8 textures, Proceedings of Vision, Modeling, and Visualization, pp.171-178, 2002. ,
Hierarchical registers for scientific computers, Proceedings of the 2nd international conference on Supercomputing , ICS '88, pp.346-353, 1988. ,
DOI : 10.1145/55364.55398
Prédicteurs mixtes pour l'anticipation des instructions, 5ème Symposium sur les Architectures Nouvelles de Machines (SYMPA'5), pp.165-174, 1999. ,
Revisiting spacetrack report #3, Proceedings of the AIAA/AAS Astrodynamics Specialist Conference, 2006. ,
Interlock collapsing ALU's, IEEE Transactions on Computers, vol.42, issue.7, pp.825-839, 1993. ,
DOI : 10.1109/12.237723
NON-SEQUENTIAL INSTRUCTION CACHE PREFETCHING FOR MULTIPLE-ISSUE PROCESSORS, International Journal of High Speed Computing, vol.10, issue.01, pp.115-140, 1999. ,
DOI : 10.1142/S0129053399000065
A precision- and range-independent tool for testing floating-point arithmetic I: basic operations, square root, and remainder, ACM Transactions on Mathematical Software, vol.27, issue.1, pp.92-118, 2001. ,
DOI : 10.1145/382043.382404
Synthesis of Floating-Point Addition Clusters on FPGAs Using Carry-Save Arithmetic, 2010 International Conference on Field Programmable Logic and Applications, pp.19-24, 2010. ,
DOI : 10.1109/FPL.2010.15
Better performance at lower occupancy, Proceedings of the GPU Technology Conference, 2010. ,
OPTIMIZATION OF LINKED LIST PREFIX COMPUTATIONS ON MULTITHREADED GPUS USING CUDA, Parallel Distributed Processing (IPDPS), 2010 IEEE International Symposium on, pp.1-8, 2010. ,
DOI : 10.1142/S0129626412500120
Instruction issue logic for pipelined supercomputers, Proceedings of the 11th International Symposium on Computer Architecture, pp.110-118, 1984. ,
Fermi GF100 GPU Architecture, IEEE Micro, vol.31, issue.2, pp.50-59, 2011. ,
DOI : 10.1109/MM.2011.24
Demystifying GPU microarchitecture through microbenchmarking, 2010 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS) ,
DOI : 10.1109/ISPASS.2010.5452013
The design and use of simplepower, Proceedings of the 37th conference on Design automation , DAC '00, pp.340-345, 2000. ,
DOI : 10.1145/337292.337436
A new evaluation of mean value for fuzzy numbers and its application to American put option under uncertainty, Fuzzy Sets and Systems, pp.2614-2626, 2006. ,
Caching processor general registers, Proceedings of ICCD '95 International Conference on Computer Design. VLSI in Computers and Processors, pp.307-312, 1995. ,
DOI : 10.1109/ICCD.1995.528826
The role of fuzzy logic in the management of uncertainty in expert systems, Fuzzy Sets and Systems, vol.11, issue.1-3, pp.197-198, 1983. ,
DOI : 10.1016/S0165-0114(83)80081-5
Hierarchical clustered register file organization for VLIW processors, Proceedings International Parallel and Distributed Processing Symposium, p.77, 2003. ,
DOI : 10.1109/IPDPS.2003.1213178
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.76.9449