o dÛ·iÊ<ã @sJddlZddlZddlmZddlmZddlmZmZm Z m Z mZmZm Z mZmZddlZddlZddlmZmZedddZerHdd lmZgd ¢ZdZded edfdd„Zded efdd„ZdIded efdd„Zdejj dedejj fdd„Z!ded efdd„Z"ded ed efd!d"„Z# dJd#eed$eed e egeffd%d&„Z$Gd'd(„d(ƒZ% dKd)ej&d*ee'd+e'd eej&fd,d-„Z(dKd.d/„Z)d0ej&d ej&fd1d2„Z*d3ej&d4ee'd ej&fd5d6„Z+ej,d7e-d8ed9efd:d;„ƒZ.ej,dd?„ƒZ/Gd@dA„dAe0ƒZ1dBed e'fdCdD„Z2dBed e'fdEdF„Z3dBed e'fdGdH„Z4dS)LéN)Úwraps)ÚMappingProxyType) Ú TYPE_CHECKINGÚAnyÚCallableÚDictÚIterableÚListÚMappingÚOptionalÚTypeVar)Ú AutoConfigÚPretrainedConfigÚTr)Úbound©ÚModelCompressor)Ú"infer_compressor_from_model_configÚfix_fsdp_module_nameÚtensor_follows_mask_structureÚreplace_moduleÚis_compressed_tensors_configÚ getattr_chainÚ deprecatedÚ AliasableÚcombine_shardsÚshard_tensorÚ pack_bitmasksÚunpack_bitmasksÚ patch_attrÚpatch_attrsÚParameterizedDefaultDictÚget_num_attn_headsÚget_num_kv_headsÚget_head_dimÚ_fsdp_wrapped_moduleÚpretrained_model_name_or_pathÚreturnrcCsfddlm}ddlm}t |¡}| |¡}|durdS| d¡}|j|fi|¤Ž}|j||d}|S)a Given a path to a model config, extract a sparsity config if it exists and return the associated ModelCompressor :param pretrained_model_name_or_path: path to model config on disk or HF hub :return: matching compressor if config contains a sparsity config rr)ÚCompressionConfigNÚformat©Úconfig) Úcompressed_tensors.compressorsrÚcompressed_tensors.configr(r Úfrom_pretrainedÚparse_sparsity_configÚgetÚload_from_registry)r&rr(r+Úsparsity_configr)Ú compressor©r4úV/home/ubuntu/vllm_env/lib/python3.10/site-packages/compressed_tensors/utils/helpers.pyrCs rÚnamecCs| tdd¡ dtd¡S)zÞ Remove FSDP wrapper prefixes from a module name Accounts for scenario where FSDP_WRAPPER_NAME is at the end of the name, as well as in the middle. :param name: name to strip :return: stripped name Ú.Ú)ÚreplaceÚFSDP_WRAPPER_NAME)r6r4r4r5r[sÿrú2:4ÚmaskcCsPttt| d¡ƒƒ\}}| d|¡}|dkjdd}t ||k¡ ¡s&t ƒ‚dS)a, :param tensor: tensor to check :param mask: mask structure to check for, in the format "n:m" :return: True if the tensor follows the mask structure, False otherwise. Note, some weights can incidentally be zero, so we check for atleast n zeros in each chunk of size m ú:éÿÿÿÿré©ÚdimT) ÚtupleÚmapÚintÚsplitÚviewÚsumÚtorchÚallÚitemÚ ValueError)Útensorr<ÚnÚmÚzero_countsr4r4r5rhs rÚmodelÚ new_modulecCsTd|vr| dd¡d}|t|ƒdd…}| |¡}nd}|}|}t|||ƒdS)Nr7r?rr8)ÚrsplitÚlenÚ get_submoduleÚsetattr)rPr6rQÚparent_nameÚ child_nameÚparentr4r4r5rsrÚcompression_configcCs.zddlm}t||ƒWStyYdSw)zÖ Returns True if CompressedTensorsConfig is available from transformers and compression_config is an instance of CompressedTensorsConfig See: https://github.com/huggingface/transformers/pull/31704 r)ÚCompressedTensorsConfigF)Ú&transformers.utils.quantization_configrZÚ isinstanceÚImportError)rYrZr4r4r5rsÿrÚobjÚ chain_strc Os‚t|ƒdkr d}|d}n d|vrd}|d}nd}| d¡}|}|D]}t||ƒs9|r0|St|›d|›ƒ‚t||ƒ}q#|S)zê Chain multiple getattr calls, separated by `.` :param obj: base object whose attributes are being retrieved :param chain_str: attribute names separated by `.` :param default: default value, throw error otherwise r?TrÚdefaultFr7z object has no attribute )rSrEÚhasattrÚAttributeErrorÚgetattr) r^r_ÚargsÚkwargsÚhas_defaultr`Ú attr_namesÚresÚ attr_namer4r4r5rœs rÚfuture_nameÚmessagecsdtdtf‡‡fdd„}|S)zË Decorator to mark functions as deprecated :param new_function: Function called in place of deprecated function :param message: Deprecation message, replaces default deprecation message Úfuncr'csFˆdurˆj›d‰ˆdurˆdˆ›d7‰tˆƒ‡‡fdd„ƒ}|S)Nz6 is deprecated and will be removed in a future releasez . Please use z instead.cstjˆtddˆ|i|¤ŽS)Né)Ú stacklevel)ÚwarningsÚwarnÚDeprecationWarning)rdre)rlrkr4r5ÚwrappedÏsz.deprecated..decorator..wrapped)Ú__name__r)rlrr©rjrk)rlr5Ú decoratorÅs ÿzdeprecated..decorator)r)rjrkrur4rtr5r»s rc@s:eZdZdZedeeeffdd„ƒZdd„Zdd„Z d S) rzˆ A mixin for enums to allow aliasing of enum members Example: >>> class MyClass(Aliasable, int, Enum): >>> ... r'cCstƒ‚©N)ÚNotImplementedErrorr4r4r4r5Úget_aliasesâszAliasable.get_aliasescCslt||jƒr | ¡}|j|jkp| |j|j¡| |j|j¡kS| ¡}| |j|j¡}| ||¡}||kSrv)r\Ú __class__rxÚvaluer0)ÚselfÚotherÚaliasesÚ self_valueÚother_valuer4r4r5Ú__eq__æsÿÿzAliasable.__eq__cCs|j |j|j¡}t|ƒSrv)r}r0rzÚhash)r{Úcanonical_valuer4r4r5Ú__hash__ószAliasable.__hash__N) rsÚ __module__Ú__qualname__Ú__doc__ÚstaticmethodrÚstrrxr€rƒr4r4r4r5rÙs rrLÚshard_sizesrAcCsTt|ƒ| |¡kr tdƒ‚g}d}|D]}||}| |||¡}| |¡|}q|S)aÐ Shards a tensor into a list of tensors along a given dimension. raises: ValueError: If the sum of shard_sizes does not match the size of the tensor along the given dimension. :param tensor: The input tensor to shard. :param shard_sizes : List of sizes for each shard along the specified dimension. :param dim : The dimension along which to shard the tensor. :returns: A list of tensors sharded along the specified dimension. zSSum of shard_sizes must equal the size of the tensor along the specified dimension.r)rGÚsizerKÚnarrowÚappend)rLr‰rAÚshardsÚ start_idxrŠÚend_idxÚshardr4r4r5røsÿ rcsª|stdƒ‚dd„|Dƒ}t|ƒdkrtdƒ‚t|djƒ}t‡fdd„|Dƒƒ|ˆ<tj||dj|djd }d}|D]}|jˆ}| ˆ||¡ |¡||7}q=|S) zé Combine decompressed shards along a given dimension using `narrow`. :param shards: List of decompressed shard tensors. :param dim: Dimension to combine along (default: 0). :return: Combined decompressed tensor. zThe list of shards is empty.cSsh|]}|j’qSr4)Údtype©Ú.0rr4r4r5Ú $sz!combine_shards..r?z$All shards must have the same dtype.rc3s|]}|jˆVqdSrv)Úshaper’r@r4r5Ú *s€z!combine_shards..)r‘Údevice)rKrSÚlistr•rGrHÚzerosr‘r—r‹Úcopy_)rrAÚshard_dtypesÚtotal_shapeÚcombinedÚshard_offsetrÚ shard_sizer4r@r5rs rÚ bytemaskscCs"tj| ¡ddd}t |¡}|S)a Converts a bytemask tensor to a bitmask tensor to reduce memory. Shape RxC will be compressed to R x ceil(C/8) :param bytemasks: mask tensor where each byte corresponds to a weight :return: mask tensor where each bit corresounds to a weight r>Úlittle)ÚaxisÚbitorder)ÚnumpyÚpackbitsrHÚ from_numpy)r Úpacked_bits_numpyÚpacked_bits_torchr4r4r5r9s rÚpacked_bitmasksÚoriginal_shapecCs8tj| ¡ ¡d|ddd}t | |¡ t¡¡}|S)a# Converts a bitmask tensor back to a bytemask tensor for use during decompression :param packed_bitmasks: mask tensor where each bit corresponds to a weight :param original_shape: dense shape to decompress to :return: boolean mask of weights in the original dense shape r>r¡)r¢Úcountr£)r¤Ú unpackbitsÚcpurHr¦ÚreshapeÚastypeÚbool)r©rªÚ unpacked_bitsÚunpacked_bitmasks_torchr4r4r5rGs üÿrÚbaseÚattrrzc csrtƒ}t|||ƒ}t|||ƒzdVW||ur!t|||ƒdSt||ƒdS||ur3t|||ƒwt||ƒw)aØ Patch the value of an object attribute. Original value is restored upon exit :param base: object which has the attribute to patch :param attr: name of the the attribute to patch :param value: used to replace original value Usage: >>> from types import SimpleNamespace >>> obj = SimpleNamespace() >>> with patch_attr(obj, "attribute", "value"): ... assert obj.attribute == "value" >>> assert not hasattr(obj, "attribute") N)ÚobjectrcrUÚdelattr)r³r´rzÚ _sentinelÚoriginal_valuer4r4r5ras€ýrÚbasesÚvaluesccs\t ¡}t||ƒD] \}}| t|||ƒ¡qdVWdƒdS1s'wYdS)aî Same as `patch_attr` but for a list of objects to patch Patch attribute for a list of objects with list of values. Original values are restored upon exit :param bases: objects which has the attribute to patch :param attr: name of the the attribute to patch :param values: used to replace original values. Must be same length as bases Usage: >>> from types import SimpleNamespace >>> obj1 = SimpleNamespace() >>> obj2 = SimpleNamespace() >>> with patch_attr([obj1, obj2], "attribute", ["value1", "value2"]): ... assert obj1.attribute == "value1" ... assert obj2.attribute == "value2" >>> assert not hasattr(obj1, "attribute") >>> assert not hasattr(obj2, "attribute") N)Ú contextlibÚ ExitStackÚzipÚ enter_contextr)r¹r´rºÚstackr³rzr4r4r5r ~s€ "ýr c@sVeZdZdZdeegeffdd„Zdedefdd„Zeiƒd œd e defdd„Z d S)r!a Similar to `collections.DefaultDict`, but upon fetching a key which is missing, the key is passed as arguments to the `default_factory` :param default_factory: function which takes a key as input and returns the corresponding default value Údefault_factorycCs||_tiƒ|_dSrv)rÀrÚ_factory_kwargs)r{rÀr4r4r5Ú__init__£sz!ParameterizedDefaultDict.__init__Úkeyr'cCs>t|tƒr|j|i|j¤Ž}n |j|fi|j¤Ž}|||<|Srv)r\rBrÀrÁ)r{rÃrzr4r4r5Ú__missing__§s z$ParameterizedDefaultDict.__missing__)Úfactory_kwargsrÅcGs8t|d|ƒ||WdƒS1swYdS)a" Similar to `__getitem__`, but allows passing kwargs to factory function :param \*args: args whose tuple will value will be treated as key :param factory_kwargs: keyword arguments to pass to `default_factory` :return: dictionary entry for given key rÁN)r)r{rÅrdr4r4r5r0¯s$ÿzParameterizedDefaultDict.getN)rsr„r…r†rrrÂrÄrr r0r4r4r4r5r!šs r!r+cCó>t|dƒr|jSt|dƒrt|dƒr|j|jStd|›ƒ‚)z† Get the number of attention heads used by a model :param config: model config :return: num_attention_heads of model Únum_attention_headsÚhidden_sizeÚhead_dimzˆCannot determine num_attention_heads from config. Config must define either `num_attention_heads` or both `hidden_size` and `head_dim`. )rarÇrÈrÉrKr*r4r4r5r"»ó þÿr"cCst|dƒr|jStd|›ƒ‚)z Get the number of key-value attention heads used by a model :param config: model config :return: num_key_value_heads of model Únum_key_value_headsz\Cannot determine num_key_value_heads from config. Config must define `num_key_value_heads`. )rarËrKr*r4r4r5r#Ðs ÿÿr#cCrÆ)z Get the number of dimensions used by the attention heads of a model :param config: model config :return: head_dim of model rÉrÈrÇz}Cannot determine head_dim from config. Config must define either `head_dim` or both `hidden_size` and `num_attention_heads`. )rarÉrÈrÇrKr*r4r4r5r$árÊr$)r;)NN)r)5r»roÚ functoolsrÚtypesrÚtypingrrrrrr r rrr¤rHÚtransformersr rrr,rÚ__all__r:rˆrrr°rÚnnÚModulerrrrrÚTensorrDrrrrÚcontextmanagerrµrr Údictr!r"r#r$r4r4r4r5Úsv,ÿ þ ÿÿÿ þ ÿÿÿÿ þ !ÿÿ þ !