Hugging Face

Installation
Quick Inference with Pipelines
Loading Models and Tokenizers
Tokenizer Usage
Fine-Tuning with Trainer API
LoRA / PEFT Fine-Tuning
Datasets Library
Publishing Models to Hub
Inference API

Key Concepts
- AutoClasses: automatically detect the right architecture
- `device_map="auto"`: Distributes model layers across available GPUs/CPU
- `load_in_4bit` / `load_in_8bit`: Quantization via bitsandbytes for reduced memory usage
- PEFT/LoRA: Train only a small fraction of parameters, which is faster and cheaper
- Datasets streaming: for huge datasets without downloading
- Trainer: High-level A…
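
To put rough numbers on the quantization and PEFT/LoRA points above, here is a minimal back-of-the-envelope sketch in plain Python. It does not call any Hugging Face library. The helper names `lora_trainable_fraction` and `weight_memory_gb`, the 4096 x 4096 layer, the rank of 8, and the 7e9 parameter count are all hypothetical values chosen for illustration. The memory estimate covers weights only and ignores activations, optimizer state, and the extra constants that real 4-bit/8-bit quantization stores.

```python
def lora_trainable_fraction(d_out: int, d_in: int, rank: int) -> float:
    """Trainable LoRA parameters relative to one frozen d_out x d_in weight matrix.

    LoRA keeps the base matrix frozen and trains a low-rank update B @ A,
    where B is d_out x rank and A is rank x d_in.
    """
    base_params = d_out * d_in               # frozen base weights
    adapter_params = rank * (d_out + d_in)   # parameters in B plus parameters in A
    return adapter_params / base_params


def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Hypothetical sizes, for illustration only.
fraction = lora_trainable_fraction(d_out=4096, d_in=4096, rank=8)
print(f"LoRA trainable fraction for one layer: {fraction:.4%}")

for bits in (16, 8, 4):
    gb = weight_memory_gb(7e9, bits)
    print(f"{bits}-bit weights, 7e9 parameters: about {gb:.1f} GB")
```

Under these assumptions the adapter is well under 1% of the layer it modifies, and halving the bits per weight halves the weight memory. Actual savings depend on which layers receive adapters and on the quantization scheme used.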