Intel® VTune™ Amplifier 2018 Help

Running gpu-profiling Analysis from the Command Line

Use the GPU In-kernel Profiling analysis to examine GPU kernel execution per code line and identify performance issues caused by memory latency or inefficient kernel algorithms.

Note

This analysis type is available on processors based on Intel® microarchitecture code name Broadwell and later.

The GPU In-kernel Profiling instruments your code and, depending on your configuration settings, helps identify performance-critical basic blocks or issues caused by memory accesses in the GPU kernels.

Since the GPU In-kernel Profiling incurs higher performance overhead than the GPU Hotspots analysis, you may consider first running the GPU Hotspots analysis to identify the hottest GPU computing task (GPU kernel) and then exploring this kernel with the GPU In-kernel Profiling.
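The two-step workflow described above can be sketched as a dry run that only prints the two command lines instead of executing them. The target path and result directory names are hypothetical placeholders, and the gpu-hotspots analysis type name is assumed to match your installed version; check it with amplxe-cl -help collect before use.

```shell
#!/bin/sh
# Dry-run sketch: builds and prints the two commands, does not run amplxe-cl.
# Assumed placeholders (not from this help page): TARGET and both result dirs.
TARGET="./myApplication"
HOTSPOTS_DIR="r000gh"
PROFILING_DIR="r001gp"

# Step 1: lower-overhead GPU Hotspots run to find the hottest GPU kernel.
step1="amplxe-cl -collect gpu-hotspots -result-dir $HOTSPOTS_DIR -- $TARGET"

# Step 2: higher-overhead GPU In-kernel Profiling run on the same target.
# Narrow it to the kernel found in step 1 with the kernels-to-profile knob.
step2="amplxe-cl -collect gpu-profiling -result-dir $PROFILING_DIR -- $TARGET"

echo "$step1"
echo "$step2"
```

Keeping the two results in separate directories makes it possible to compare the hotspot list from the first run with the per-line data from the second.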

GPU In-kernel Profiling introduces the following key metrics:

Syntax:

$ amplxe-cl -collect gpu-profiling [-knob <knobName=knobValue>] -- <target> [target_options]

Knobs: gpu-profiling-mode, kernels-to-profile.

Note

For the most current information on available knobs (configuration options) for the GPU In-kernel Profiling, enter:

$ amplxe-cl -help collect gpu-profiling

Example:

This example runs GPU In-kernel Profiling in the memlatency mode on a Linux target, analyzing only the kernel1 and kernel2 kernels with a sampling interval of 10 kernels.

$ amplxe-cl -collect gpu-profiling -knob gpu-profiling-mode=memlatency -knob kernels-to-profile=kernel1:1:10:4294967185,kernel2:1:10:4294967185 -- /home/test/myApplication
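When several kernels are profiled, the kernels-to-profile value becomes long and easy to mistype. The following is a minimal sketch that joins per-kernel entries with commas and prints the resulting command as a dry run. The join_specs helper and the TARGET placeholder are hypothetical, and each entry is treated as an opaque string copied from the example above.

```shell
#!/bin/sh
# join_specs: hypothetical helper that joins its arguments with commas,
# producing a value suitable for the kernels-to-profile knob.
join_specs() {
    out=""
    for spec in "$@"; do
        if [ -z "$out" ]; then
            out="$spec"
        else
            out="$out,$spec"
        fi
    done
    printf '%s\n' "$out"
}

# Per-kernel entries are copied verbatim from the example above.
KERNELS=$(join_specs "kernel1:1:10:4294967185" "kernel2:1:10:4294967185")

# Assumed placeholder for the application to profile.
TARGET="./myApplication"

# Dry run: print the command instead of executing it.
echo "amplxe-cl -collect gpu-profiling -knob gpu-profiling-mode=memlatency -knob kernels-to-profile=$KERNELS -- $TARGET"
```

To run the collection, replace the final echo with the command itself once the printed line looks correct.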

What's Next

When the data collection is complete, do one of the following to view the result:

Related Information