gpus1 Ubuntu 10.04 Available via ssh

Similar documents
Overview. Lecture 1: an introduction to CUDA. Hardware view. Hardware view. hardware view software view CUDA programming

Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011

NVIDIA GeForce GTX 580 GPU Datasheet

Introduction to GPGPU. Tiziano Diamanti

Next Generation GPU Architecture Code-named Fermi

Introduction to GP-GPUs. Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1

Introduction GPU Hardware GPU Computing Today GPU Computing Example Outlook Summary. GPU Computing. Numerical Simulation - from Models to Software

Introduction to GPU hardware and to CUDA

NVIDIA CUDA GETTING STARTED GUIDE FOR MICROSOFT WINDOWS

GPU System Architecture. Alan Gray EPCC The University of Edinburgh

Programming models for heterogeneous computing. Manuel Ujaldón Nvidia CUDA Fellow and A/Prof. Computer Architecture Department University of Malaga

Getting Started with CodeXL

GPU Parallel Computing Architecture and CUDA Programming Model

OpenCL Optimization. San Jose 10/2/2009 Peng Wang, NVIDIA

Mitglied der Helmholtz-Gemeinschaft. OpenCL Basics. Parallel Computing on GPU and CPU. Willi Homberg. 23. März 2011

QuickSpecs. NVIDIA Quadro K2200 4GB Graphics INTRODUCTION. NVIDIA Quadro K2200 4GB Graphics. Technical Specifications

QuickSpecs. NVIDIA Quadro M GB Graphics INTRODUCTION. NVIDIA Quadro M GB Graphics. Overview

CUDA Basics. Murphy Stein New York University

Experiences on using GPU accelerators for data analysis in ROOT/RooFit

GPGPU Computing. Yong Cao

AMD CodeXL 1.7 GA Release Notes

Introduction to GPU Architecture

HP Workstations graphics card options

CUDA programming on NVIDIA GPUs

QuickSpecs. NVIDIA Quadro K5200 8GB Graphics INTRODUCTION. NVIDIA Quadro K5200 8GB Graphics. Technical Specifications

Transcend the Vision. Embedded Graphic Solutions that Lead to New Territory. Embedded Graphic Solutions.

Lecture 1: an introduction to CUDA

QuickSpecs. NVIDIA Quadro K1200 4GB Graphics INTRODUCTION PERFORMANCE AND FEATURES. Overview

QuickSpecs. NVIDIA Quadro K4200 4GB Graphics INTRODUCTION. NVIDIA Quadro K4200 4GB Graphics. Overview

Ensure that the AMD APP SDK Samples package has been installed before proceeding.

NVIDIA CUDA INSTALLATION GUIDE FOR MICROSOFT WINDOWS

Several tips on how to choose a suitable computer

QuickSpecs. NVIDIA Quadro K5200 8GB Graphics INTRODUCTION. NVIDIA Quadro K5200 8GB Graphics. Overview. NVIDIA Quadro K5200 8GB Graphics J3G90AA

GPU Architectures. A CPU Perspective. Data Parallelism: What is it, and how to exploit it? Workload characteristics

GPUs for Scientific Computing

Optimizing Application Performance with CUDA Profiling Tools

Introduction to OpenCL Programming. Training Guide

FLOATING-POINT ARITHMETIC IN AMD PROCESSORS MICHAEL SCHULTE AMD RESEARCH JUNE 2015

ST810 Advanced Computing

Several tips on how to choose a suitable computer

NVIDIA CUDA GETTING STARTED GUIDE FOR MAC OS X

GPU Profiling with AMD CodeXL

Getting Started with the ZED 2 Introduction... 2

GPU Computing - CUDA

The Evolution of Computer Graphics. SVP, Content & Technology, NVIDIA

E6895 Advanced Big Data Analytics Lecture 14:! NVIDIA GPU Examples and GPU on ios devices

NVIDIA CUDA GETTING STARTED GUIDE FOR MAC OS X

Applications to Computational Financial and GPU Computing. May 16th. Dr. Daniel Egloff

================================================================== CONTENTS ==================================================================

TESLA C2050/2070 COMPUTING PROCESSOR INSTALLATION GUIDE

Reviewer s Guide for Remote 3D Graphics Apps

A quick tutorial on Intel's Xeon Phi Coprocessor

ATI Radeon 4800 series Graphics. Michael Doggett Graphics Architecture Group Graphics Product Group

Data Sheet. Desktop ESPRIMO. General

Intel Media Server Studio - Metrics Monitor (v1.1.0) Reference Manual

Configuring Memory on the HP Business Desktop dx5150

OpenACC 2.0 and the PGI Accelerator Compilers

GPGPU Parallel Merge Sort Algorithm

Stream Processing on GPUs Using Distributed Multimedia Middleware

Installing Emageon PACS Remote Ultravisual

GPU Tools Sandra Wienke

GRID VGPU FOR VMWARE VSPHERE

AMD GPU Architecture. OpenCL Tutorial, PPAM Dominik Behr September 13th, 2009

Radeon GPU Architecture and the Radeon 4800 series. Michael Doggett Graphics Architecture Group June 27, 2008

ultra fast SOM using CUDA

Boundless Security Systems, Inc.

OpenCL Programming for the CUDA Architecture. Version 2.3

Introduction to GPU Programming Languages

Course materials. In addition to these slides, C++ API header files, a set of exercises, and solutions, the following are useful:

Parallel Image Processing with CUDA A case study with the Canny Edge Detection Filter

RWTH GPU Cluster. Sandra Wienke November Rechen- und Kommunikationszentrum (RZ) Fotos: Christian Iwainsky

GPU File System Encryption Kartik Kulkarni and Eugene Linkov

CLOUD GAMING WITH NVIDIA GRID TECHNOLOGIES Franck DIARD, Ph.D., SW Chief Software Architect GDC 2014

HP Z Workstations graphics card options

Java GPU Computing. Maarten Steur & Arjan Lamers

Introducing PgOpenCL A New PostgreSQL Procedural Language Unlocking the Power of the GPU! By Tim Child

SAPPHIRE TOXIC R9 270X 2GB GDDR5 WITH BOOST

HP Workstations graphics card options

GPU Hardware and Programming Models. Jeremy Appleyard, September 2015

TEGRA X1 DEVELOPER TOOLS SEBASTIEN DOMINE, SR. DIRECTOR SW ENGINEERING

Precision & Performance: Floating Point and IEEE 754 Compliance for NVIDIA GPUs

GPU Usage. Requirements

CUDA Debugging. GPGPU Workshop, August Sandra Wienke Center for Computing and Communication, RWTH Aachen University

Parallel Programming Survey

How To Use An Amd Ramfire R7 With A 4Gb Memory Card With A 2Gb Memory Chip With A 3D Graphics Card With An 8Gb Card With 2Gb Graphics Card (With 2D) And A 2D Video Card With


AMD Radeon HD 8000M Series GPU Specifications AMD Radeon HD 8870M Series GPU Feature Summary

Installation Guide. (Version ) Midland Valley Exploration Ltd 144 West George Street Glasgow G2 2HG United Kingdom

IP Video Rendering Basics

GeoImaging Accelerator Pansharp Test Results

System Requirements G E N E R A L S Y S T E M R E C O M M E N D A T I O N S

Optimization. NVIDIA OpenCL Best Practices Guide. Version 1.0

L20: GPU Architecture and Models

NVIDIA workstation 3D graphics card upgrade options deliver productivity improvements and superior image quality

AMD APP SDK v2.8 FAQ. 1 General Questions

Introduction to Numerical General Purpose GPU Computing with NVIDIA CUDA. Part 1: Hardware design and programming model

Connectivity Pack for Microsoft Guide

Guided Performance Analysis with the NVIDIA Visual Profiler

Brainlab Node TM Technical Specifications

Transcription:

gpus1 Ubuntu 10.04 Available via ssh root@gpus1:[~]#lspci -v grep VGA 01:04.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a) 03:00.0 VGA compatible controller: nvidia Corporation GF100 [Tesla C2050 / C2070] (rev a3) root@gpus1:[~]#cd /gpusrc/nvidia_gpu_computing_sdk/c/bin/linux/release/ root@gpus1:[/gpusrc/nvidia_gpu_computing_sdk/c/bin/linux/release]#./devicequery./devicequery Starting... CUDA Device Query (Runtime API) version (CUDART static linking) There is 1 device supporting CUDA Device 0: "Tesla C2070" CUDA Driver Version: 3.20 CUDA Runtime Version: 3.20 CUDA Capability Major/Minor version number: 2.0 Total amount of global memory: 5636554752 bytes Multiprocessors x Cores/MP = Cores: 14 (MP) x 32 (Cores/MP) = 448 (Cores) Total amount of constant memory: 65536 bytes Total amount of shared memory per block: 49152 bytes Total number of registers available per block: 32768 Warp size: 32 Maximum number of threads per block: 1024 Maximum sizes of each dimension of a block: 1024 x 1024 x 64 Maximum sizes of each dimension of a grid: 65535 x 65535 x 1 Maximum memory pitch: 2147483647 bytes Texture alignment: 512 bytes Clock rate: 1.15 GHz Concurrent copy and execution: Yes Run time limit on kernels: Integrated: Support host page-locked memory mapping: Yes Compute mode: Default (multiple host threads can use this device simultaneously) Concurrent kernel execution: Yes Device has ECC support enabled: Yes Device is using TCC driver mode: devicequery, CUDA Driver = CUDART, CUDA Driver Version = 3.20, CUDA Runtime Version = 3.20, NumDevs = 1, Device = Tesla C2070 PASSED ======================================

gpus2 Ubuntu 10.04 Available via ssh root@gpus2:[~]#lspci -v grep VGA 01:04.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a) 06:00.0 VGA compatible controller: ATI Technologies Inc Antilles [AMD Radeon HD 6990] root@gpus2:[~]#fglrxinfo display: :0.0 screen: 0 OpenGL vendor string: ATI Technologies Inc. OpenGL renderer string: AMD Radeon HD 6990 OpenGL version string: 4.1.10750 Compatibility Profile Context dhe@gpus2:[/gpusrc/amd-app-sdk-v2.4-lnx64/bin/x86_64]$./clinfo Number of platforms: 1 Platform Profile: FULL_PROFILE Platform Version: OpenCL 1.1 AMD-APP-SDK-v2.4 (595.10) Platform Name: AMD Accelerated Parallel Processing Platform Vendor: Advanced Micro Devices, Inc. Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices Platform Name: AMD Accelerated Parallel Processing Number of devices: 1 Device Type: CL_DEVICE_TYPE_CPU Device ID: 4098 Max compute units: 24 Max work items dimensions: 3 Max work items[0]: 1024 Max work items[1]: 1024 Max work items[2]: 1024 Max work group size: 1024 Preferred vector width char: 16 Preferred vector width short: 8 Preferred vector width int: 4 Preferred vector width long: 2 Preferred vector width float: 4 Preferred vector width double: 0 Native vector width char: 16 Native vector width short: 8 Native vector width int: 4 Native vector width long: 2 Native vector width float: 4 Native vector width double: 0 Max clock frequency: 800Mhz Address bits: 64 Max memory allocation: 16923791360 Image support: Yes Max number of images read arguments: 128

Max number of images write arguments: 8 Max image 2D width: 8192 Max image 2D height: 8192 Max image 3D width: 2048 Max image 3D height: 2048 Max image 3D depth: 2048 Max samplers within kernel: 16 Max size of kernel argument: 4096 Alignment (bits) of base address: 1024 Minimum alignment (bytes) for any datatype: 128 Single precision floating point capability Denorms: Yes Quiet NaNs: Yes Round to nearest even: Yes Round to zero: Yes Round to +ve and infinity: Yes IEEE754-2008 fused multiply-add: Cache type: Read/Write Cache line size: 64 Cache size: 65536 Global memory size: 67695165440 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Global Local memory size: 32768 Kernel Preferred work group size multiple: 1 Error correction support: 0 Unified memory for Host and Device: 1 Profiling timer resolution: 1 Device endianess: Little Available: Yes Compiler available: Yes Execution capabilities: Execute OpenCL kernels: Yes Execute native function: Yes Queue properties: Out-of-Order: Profiling : Yes Platform ID: 0x7f23cec69800 Name: AMD Opteron(tm) Processor 6168 Vendor: AuthenticAMD Driver version: 2.0 Profile: FULL_PROFILE Version: OpenCL 1.1 AMD-APP-SDK-v2.4 (595.10) Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_media_ops

cl_amd_popcnt cl_amd_printf dhe@gpus2:[/gpusrc/amd-app-sdk-v2.4-lnx64/bin/x86_64]$ =================================================================== gpus3 Windows 2008r2 Windows 2008 R2 Server ready to test. You can use the standard rdp option to connect and test. Rdesktop to gpus3.cise.ufl.edu. If you need access to the Nvidia Control Panel applet, download and install Tightvnc 2.0.3 Viewer from the link below. http://www.tightvnc.com/download.php Enter the vncviewer password to connect. Vncviewer password Vncv;ew Packages installed: Furmark GPU Stress Test GPU Caps Viewer Visual Studio 2008 Visual Studio 2010 Microsoft DirectX SDK Nvidia 3D Vision Controller and Drivers Nvidia Cuda Toolkit v.4 Nvidia Cuda Tools API Nvidia GPU Computing SDK 4.0 Nvidia Parallel Nsight 2.0 for Visual Studio Nvidia Physx System Software Tightvnc Server ================================================= gpus4 Ubuntu 10.04 Available via ssh root@gpus4:[~]#lspci -v grep VGA 01:04.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a) 05:00.0 VGA compatible controller: nvidia Corporation Device 1088 (rev a1) root@gpus4:[/gpusrc/nvidia_gpu_computing_sdk/c/bin/linux/release]#./devicequery [devicequery] starting..../devicequery Starting... CUDA Device Query (Runtime API) version (CUDART static linking) Found 2 CUDA Capable device(s) Device 0: "GeForce GTX 590" CUDA Driver Version / Runtime Version 4.0 / 4.0

CUDA Capability Major/Minor version number: 2.0 Total amount of global memory: 1536 MBytes (1610285056 bytes) (16) Multiprocessors x (32) CUDA Cores/MP: 512 CUDA Cores GPU Clock Speed: 1.26 GHz Memory Clock rate: 1728.00 Mhz Memory Bus Width: 384-bit L2 Cache Size: 786432 bytes Max Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048) Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048 Total amount of constant memory: 65536 bytes Total amount of shared memory per block: 49152 bytes Total number of registers available per block: 32768 Warp size: 32 Maximum number of threads per block: 1024 Maximum sizes of each dimension of a block: 1024 x 1024 x 64 Maximum sizes of each dimension of a grid: 65535 x 65535 x 65535 Maximum memory pitch: 2147483647 bytes Texture alignment: 512 bytes Concurrent copy and execution: Yes with 1 copy engine(s) Run time limit on kernels: Integrated GPU sharing Host Memory: Support host page-locked memory mapping: Yes Concurrent kernel execution: Yes Alignment requirement for Surfaces: Yes Device has ECC support enabled: Device is using TCC driver mode: Device supports Unified Addressing (UVA): Yes Device PCI Bus ID / PCI location ID: 6 / 0 Compute Mode: < Default (multiple host threads can use ::cudasetdevice() with device simultaneously) > Device 1: "GeForce GTX 590" CUDA Driver Version / Runtime Version 4.0 / 4.0 CUDA Capability Major/Minor version number: 2.0 Total amount of global memory: 1536 MBytes (1610285056 bytes) (16) Multiprocessors x (32) CUDA Cores/MP: 512 CUDA Cores GPU Clock Speed: 1.26 GHz Memory Clock rate: 1728.00 Mhz Memory Bus Width: 384-bit L2 Cache Size: 786432 bytes Max Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048) Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048 Total amount of constant memory: 65536 bytes Total amount of shared memory per block: 49152 bytes Total number of registers available per block: 32768 Warp size: 32 Maximum number of threads per block: 1024 Maximum sizes of each dimension of a block: 1024 x 1024 x 64 Maximum sizes of each dimension of a grid: 65535 x 65535 x 65535

Maximum memory pitch: 2147483647 bytes Texture alignment: 512 bytes Concurrent copy and execution: Yes with 1 copy engine(s) Run time limit on kernels: Integrated GPU sharing Host Memory: Support host page-locked memory mapping: Yes Concurrent kernel execution: Yes Alignment requirement for Surfaces: Yes Device has ECC support enabled: Device is using TCC driver mode: Device supports Unified Addressing (UVA): Yes Device PCI Bus ID / PCI location ID: 5 / 0 Compute Mode: < Default (multiple host threads can use ::cudasetdevice() with device simultaneously) > devicequery, CUDA Driver = CUDART, CUDA Driver Version = 4.0, CUDA Runtime Version = 4.0, NumDevs = 2, Device = GeForce GTX 590, Device = GeForce GTX 590 [devicequery] test results... PASSED ========================================================= gpus5 Ubuntu 10.04 Available via ssh root@gpus5:[~]#lspci grep VGA 01:04.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a) 03:00.0 VGA compatible controller: ATI Technologies Inc Cayman PRO [AMD Radeon 6900 Series] 41:00.0 VGA compatible controller: ATI Technologies Inc Cayman PRO [AMD Radeon 6900 Series] root@gpus5:[~]#fglrxinfo display: :0.0 screen: 0 OpenGL vendor string: ATI Technologies Inc. OpenGL renderer string: AMD Radeon HD 6900 Series OpenGL version string: 4.1.10750 Compatibility Profile Context root@gpus5:[/gpusrc/amd-app-sdk-v2.4-lnx64/bin/x86_64]#./clinfo Number of platforms: 1 Platform Profile: FULL_PROFILE Platform Version: OpenCL 1.1 AMD-APP-SDK-v2.4 (595.10) Platform Name: AMD Accelerated Parallel Processing Platform Vendor: Advanced Micro Devices, Inc. Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices Platform Name: AMD Accelerated Parallel Processing Number of devices: 2 Device Type: CL_DEVICE_TYPE_GPU Device ID: 4098

Max compute units: 22 Max work items dimensions: 3 Max work items[0]: 256 Max work items[1]: 256 Max work items[2]: 256 Max work group size: 256 Preferred vector width char: 16 Preferred vector width short: 8 Preferred vector width int: 4 Preferred vector width long: 2 Preferred vector width float: 4 Preferred vector width double: 0 Native vector width char: 16 Native vector width short: 8 Native vector width int: 4 Native vector width long: 2 Native vector width float: 4 Native vector width double: 0 Max clock frequency: 0Mhz Address bits: 32 Max memory allocation: 134217728 Image support: Yes Max number of images read arguments: 128 Max number of images write arguments: 8 Max image 2D width: 8192 Max image 2D height: 8192 Max image 3D width: 2048 Max image 3D height: 2048 Max image 3D depth: 2048 Max samplers within kernel: 16 Max size of kernel argument: 1024 Alignment (bits) of base address: 32768 Minimum alignment (bytes) for any datatype: 128 Single precision floating point capability Denorms: Quiet NaNs: Yes Round to nearest even: Yes Round to zero: Yes Round to +ve and infinity: Yes IEEE754-2008 fused multiply-add: Yes Cache type: ne Cache line size: 0 Cache size: 0 Global memory size: 536870912 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Scratchpad Local memory size: 32768 Kernel Preferred work group size multiple: 64

Error correction support: 0 Unified memory for Host and Device: 0 Profiling timer resolution: 1 Device endianess: Little Available: Yes Compiler available: Yes Execution capabilities: Execute OpenCL kernels: Yes Execute native function: Queue properties: Out-of-Order: Profiling : Yes Platform ID: 0x7f859d4dc800 Name: Cayman Vendor: Advanced Micro Devices, Inc. Driver version: CAL 1.4.1385 Profile: FULL_PROFILE Version: OpenCL 1.1 AMD-APP-SDK-v2.4 (595.10) Extensions: cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_printf cl_amd_media_ops cl_amd_popcnt Device Type: CL_DEVICE_TYPE_CPU Device ID: 4098 Max compute units: 24 Max work items dimensions: 3 Max work items[0]: 1024 Max work items[1]: 1024 Max work items[2]: 1024 Max work group size: 1024 Preferred vector width char: 16 Preferred vector width short: 8 Preferred vector width int: 4 Preferred vector width long: 2 Preferred vector width float: 4 Preferred vector width double: 0 Native vector width char: 16 Native vector width short: 8 Native vector width int: 4 Native vector width long: 2 Native vector width float: 4 Native vector width double: 0 Max clock frequency: 800Mhz Address bits: 64 Max memory allocation: 16923774976 Image support: Yes Max number of images read arguments: 128

Max number of images write arguments: 8 Max image 2D width: 8192 Max image 2D height: 8192 Max image 3D width: 2048 Max image 3D height: 2048 Max image 3D depth: 2048 Max samplers within kernel: 16 Max size of kernel argument: 4096 Alignment (bits) of base address: 1024 Minimum alignment (bytes) for any datatype: 128 Single precision floating point capability Denorms: Yes Quiet NaNs: Yes Round to nearest even: Yes Round to zero: Yes Round to +ve and infinity: Yes IEEE754-2008 fused multiply-add: Cache type: Read/Write Cache line size: 64 Cache size: 65536 Global memory size: 67695099904 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Global Local memory size: 32768 Kernel Preferred work group size multiple: 1 Error correction support: 0 Unified memory for Host and Device: 1 Profiling timer resolution: 1 Device endianess: Little Available: Yes Compiler available: Yes Execution capabilities: Execute OpenCL kernels: Yes Execute native function: Yes Queue properties: Out-of-Order: Profiling : Yes Platform ID: 0x7f859d4dc800 Name: AMD Opteron(tm) Processor 6168 Vendor: AuthenticAMD Driver version: 2.0 Profile: FULL_PROFILE Version: OpenCL 1.1 AMD-APP-SDK-v2.4 (595.10) Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_media_ops

cl_amd_popcnt cl_amd_printf root@gpus5:[/gpusrc/amd-app-sdk-v2.4-lnx64/bin/x86_64]# ===================================================== Important Directory tes: /gpusrc This is a local place that you can copy your code and store your own personal copy of the sdk. Warning: This space is not backed up! There is a modified cuda SDK in /gpusrc. NVIDIA_GPU_Computing_SDK The ATI Dev Kit can also be found in /gpusrc under: AMD-APP-SDK-v2.4-lnx64 There are some path-ing issues with the default download which have been corrected in the sdk copied there. ==================================================== CUDA Programming tes: * Please make sure your PATH includes /usr/local/cuda/bin * Please make sure your LD_LIBRARY_PATH * for 32-bit Linux distributions includes /usr/local/cuda/lib * for 64-bit Linux distributions includes /usr/local/cuda/lib64:/usr/local/cuda/lib * OR * for 32-bit Linux distributions add /usr/local/cuda/lib * for 64-bit Linux distributions add /usr/local/cuda/lib64 and /usr/local/cuda/lib * to /etc/ld.so.conf and run ldconfig as root * Please read the release notes in /usr/local/cuda/doc/ These paths can automatically be setup for you, by dot'ting your shell with either of these two files, depending on the shell you are using: root@gpus4:[/gpusrc]#ls cuda_bash_setup cuda_tcsh_setup example: $. cuda_bash_setup ============================================= ATI NOTES: You must have amd/ati libraries in your path for development, to make this easier for you, there is an example setup under /gpusrc you can simply dot your shell with this file. opencl_bash_setup

$. opencl_bash_setup