PyTorchProfiler
- class lightning.pytorch.profilers.PyTorchProfiler(dirpath=None, filename=None, group_by_input_shapes=False, emit_nvtx=False, export_to_chrome=True, row_limit=20, sort_by_key=None, record_module_names=True, table_kwargs=None, **profiler_kwargs)
Bases: Profiler

This profiler uses PyTorch’s Autograd Profiler and lets you inspect the cost of different operators inside your model, both on the CPU and GPU.
- Parameters:
dirpath (Union[str, Path, None]) – Directory path for the filename. If dirpath is None but filename is present, the trainer.log_dir (from TensorBoardLogger) will be used.
filename (Optional[str]) – If present, filename where the profiler results will be saved instead of printing to stdout. The .txt extension will be used automatically.
group_by_input_shapes (bool) – Include operator input shapes and group calls by shape.
emit_nvtx (bool) – Whether to use the context manager that makes every autograd operation emit an NVTX range. Run:
    nvprof --profile-from-start off -o trace_name.prof -- <regular command here>
To visualize, you can either use:
    nvvp trace_name.prof
    torch.autograd.profiler.load_nvprof(path)
export_to_chrome (bool) – Whether to export the sequence of profiled operators for Chrome. It will generate a .json file which can be read by Chrome.
row_limit (int) – Limit the number of rows in a table; -1 is a special value that removes the limit completely.
sort_by_key (Optional[str]) – Attribute used to sort entries. By default they are printed in the same order as they were registered. Valid keys include: cpu_time, cuda_time, cpu_time_total, cuda_time_total, cpu_memory_usage, cuda_memory_usage, self_cpu_memory_usage, self_cuda_memory_usage, count.
record_module_names (bool) – Whether to add module names while recording autograd operations.
table_kwargs (Optional[dict[str, Any]]) – Dictionary with keyword arguments for the summary table.
**profiler_kwargs (Any) – Keyword arguments for the PyTorch profiler. The accepted arguments depend on your PyTorch version.
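
As a rough illustration of how these parameters combine, the sketch below collects a few of them in a dictionary and forwards them to the constructor. The directory name profiler_logs, the file name perf, and the helper make_profiler are hypothetical choices made for this example. The Lightning import is deferred into the helper so that the settings can be inspected without Lightning installed.

```python
# Hypothetical settings; every key is a constructor parameter documented above.
SETTINGS = {
    "dirpath": "profiler_logs",       # assumed output directory
    "filename": "perf",               # the .txt extension is added automatically
    "sort_by_key": "cpu_time_total",  # one of the valid keys listed above
    "row_limit": 20,                  # -1 would remove the row limit
    "export_to_chrome": True,         # also write a .json trace for Chrome
}


def make_profiler(**overrides):
    """Build a PyTorchProfiler from SETTINGS, with optional overrides."""
    # Deferred import: the `lightning` package is required at call time only.
    from lightning.pytorch.profilers import PyTorchProfiler

    return PyTorchProfiler(**{**SETTINGS, **overrides})
```

The resulting object would then typically be handed to the trainer, for example Trainer(profiler=make_profiler()).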
- Raises:
MisconfigurationException – If arg sort_by_key is not present in AVAILABLE_SORT_KEYS. If arg schedule is not a Callable. If arg schedule does not return a torch.profiler.ProfilerAction.
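
The schedule conditions above refer to a keyword argument forwarded through **profiler_kwargs. The following is a minimal sketch of a value that should satisfy both checks, assuming a PyTorch version that provides torch.profiler.schedule. The helper name make_scheduled_profiler and the step counts are illustrative only.

```python
# Illustrative step counts for torch.profiler.schedule (assumed values).
SCHEDULE_STEPS = {"wait": 1, "warmup": 1, "active": 3, "repeat": 1}


def make_scheduled_profiler():
    """Build a PyTorchProfiler whose `schedule` argument is a valid callable."""
    # Deferred imports: torch and lightning are needed only when this is called.
    from torch.profiler import schedule
    from lightning.pytorch.profilers import PyTorchProfiler

    # schedule(...) returns a callable that maps a step index to a
    # torch.profiler.ProfilerAction, which is what the two checks require.
    return PyTorchProfiler(schedule=schedule(**SCHEDULE_STEPS))
```

By contrast, passing a non-callable such as schedule=3 would be expected to raise MisconfigurationException, per the first schedule condition.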