We are excited to announce the release of PyTorch® 2.1! PyTorch 2.1 offers automatic dynamic shape support in torch.compile, torch.distributed.checkpoint for saving/loading distributed training jobs on multiple ranks in parallel, and torch.compile support for the NumPy API.

In addition, this release offers numerous performance improvements (e.g. CPU inductor improvements, AVX512 support, scaled-dot-product-attention support) as well as a prototype release of torch.export, a sound full-graph capture mechanism, and torch.export-based quantization.

Along with 2.1, we are also releasing a series of updates to the PyTorch domain libraries. More details can be found in the library updates blog.

This release is composed of 6,682 commits and 784 contributors since 2.0. We want to sincerely thank our dedicated community for your contributions. As always, we encourage you to try these out and report any issues as we improve 2.1. More information about how to get started with the PyTorch 2-series can be found at our Getting Started page.

Summary:
- torch.compile now includes automatic support for detecting and minimizing recompilations due to tensor shape changes using automatic dynamic shapes.
- torch.distributed.checkpoint enables saving and loading models from multiple ranks in parallel, as well as resharding due to changes in cluster topology.
- torch.compile can now compile NumPy operations via translating them into PyTorch-equivalent operations.
- torch.compile now includes improved support for Python 3.11.
- New CPU performance features include inductor improvements (e.g. bfloat16 support and dynamic shapes), AVX512 kernel support, and scaled-dot-product-attention kernels.
- torch.export, a sound full-graph capture mechanism, is introduced as a prototype feature, along with torch.export-based quantization.
- torch.sparse now includes prototype support for semi-structured (2:4) sparsity on NVIDIA® GPUs.
- CPU optimizations for scaled-dot-product-attention (SDPA).
- Third-party device integration: PrivateUse1.
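To illustrate the automatic dynamic shapes feature, here is a minimal sketch: the function and tensor sizes are made up for the example, and `backend="eager"` is used only so the snippet runs without a C++ toolchain (the default inductor backend applies the same dynamic-shape logic).

```python
import torch

# Illustrative toy function; the name is an assumption, not from the release notes.
def scale_and_sum(x):
    return (x * 2.0).sum()

# backend="eager" keeps this sketch dependency-free; drop it to use inductor.
compiled = torch.compile(scale_and_sum, backend="eager")

# The first call specializes on shape (4,). When the shape changes on the
# second call, torch.compile detects it and switches to a dynamic-shape graph
# instead of recompiling once per size.
a = torch.arange(4, dtype=torch.float32)
b = torch.arange(8, dtype=torch.float32)
print(compiled(a).item(), compiled(b).item())  # 12.0 56.0
```

Both calls return the same values as the eager function; the benefit is fewer recompilations as input sizes vary.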
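The NumPy interoperability can be sketched as follows: a function written purely against NumPy is passed to torch.compile, which traces the NumPy calls and translates them to PyTorch-equivalent operations. The function name and inputs here are illustrative, and `backend="eager"` again just avoids needing a compiler toolchain.

```python
import numpy as np
import torch

# Illustrative NumPy-only function (name is an assumption for this sketch).
def numpy_norm(x):
    return np.sqrt(np.sum(x * x))

compiled = torch.compile(numpy_norm, backend="eager")

x = np.array([3.0, 4.0])
print(float(compiled(x)))  # same result as plain NumPy: 5.0
```

Under the hood, the NumPy arrays are mapped to tensors for the traced graph and the result is converted back, so callers keep a NumPy-in, NumPy-out interface.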
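The SDPA kernels mentioned above sit behind the existing `torch.nn.functional.scaled_dot_product_attention` API. A minimal sketch, with arbitrary example shapes (batch=2, heads=4, seq=8, head_dim=16), showing the reference computation the fused kernel is numerically equivalent to:

```python
import math
import torch
import torch.nn.functional as F

# Arbitrary example shapes: (batch, heads, seq_len, head_dim).
q = torch.randn(2, 4, 8, 16)
k = torch.randn(2, 4, 8, 16)
v = torch.randn(2, 4, 8, 16)

# Fused attention; on CPU this dispatches to the optimized kernels.
out = F.scaled_dot_product_attention(q, k, v)

# Unfused reference: softmax(Q K^T / sqrt(d)) V.
ref = torch.softmax(q @ k.transpose(-2, -1) / math.sqrt(q.size(-1)), dim=-1) @ v
print(torch.allclose(out, ref, atol=1e-5))
```

The fused path avoids materializing the full attention-weight matrix where a specialized kernel is available, which is where the CPU speedups come from.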
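A short sketch of the torch.export prototype: a module is captured into a single full graph ahead of time, and the exported program can then be replayed on new inputs. The module here is invented for the example, and the `ep.module()` accessor reflects the API in recent PyTorch releases.

```python
import torch

class TinyModel(torch.nn.Module):  # illustrative module, not from the release notes
    def forward(self, x):
        return torch.relu(x) + 1.0

# Capture one sound, full graph of the forward pass from example inputs.
example_inputs = (torch.randn(3),)
ep = torch.export.export(TinyModel(), example_inputs)

# Replay the captured graph on fresh inputs.
x = torch.tensor([-1.0, 0.0, 2.0])
print(ep.module()(x))  # tensor([1., 1., 3.])
```

Unlike tracing approaches that silently fall back to Python, export either captures the whole program or raises, which is what makes it a suitable base for quantization and other whole-graph transforms.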