Upload speeds extremely slow / stalling since April 1st
Since yesterday afternoon (April 1st), I've been experiencing extremely slow upload speeds when uploading GGUF model files to the Hub using hf upload. Uploads start at a reasonable speed (~1 MB/s), then progressively slow to a few KB/s, and eventually stall at ~110 KB/s with seemingly no progress at all.

What I've tried:
- Uploading all files at once vs. a single file: same issue
- Disabling xet (HF_HUB_ENABLE_XET=0) and hf-transfer (HF_HUB_ENABLE_HF_TRANSFER=0): same issue
- Using an older version of huggingface-hub (0.36.2): same issue
- Checked status.huggingface.co: no reported issues
- My internet connection is fine for everything else

The pattern is consistent: uploads begin at normal speed, then gradually degrade over a few minutes until they complete.
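For anyone trying to pin down the same pattern, a rolling throughput computed from (timestamp, cumulative bytes sent) samples makes the degradation easy to log over time. A minimal sketch (the sample values below are illustrative, not measurements from the actual upload):

```python
from collections import deque

def rolling_throughput(samples, window_s=30.0):
    """Compute throughput (bytes/s) over a sliding time window.

    samples: iterable of (timestamp_seconds, cumulative_bytes_sent) pairs.
    Yields (timestamp, bytes_per_second) once at least two samples are in
    the window.
    """
    buf = deque()
    for t, sent in samples:
        buf.append((t, sent))
        # Drop samples older than the window.
        while buf and t - buf[0][0] > window_s:
            buf.popleft()
        if len(buf) >= 2:
            dt = t - buf[0][0]
            db = sent - buf[0][1]
            yield t, db / dt

# Illustrative samples: a fast start, then a slowdown.
samples = [(0, 0), (10, 10_000_000), (20, 11_000_000), (30, 11_100_000)]
for t, bps in rolling_throughput(samples):
    print(f"t={t:>3}s  {bps / 1e3:8.1f} KB/s")
```

Logging this alongside the upload makes it easy to show support exactly when the degradation starts and how steep it is.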
Read on discuss.huggingface.co: https://discuss.huggingface.co/t/upload-speeds-extremely-slow-stalling-since-april-1st/174910

Which cloud architecture decision do tech leaders regret most? Treating AI like just another workload
For years, cloud strategy rewarded standardization: treat everything as a workload, abstract the differences, optimize for scale and cost. That mindset helped enterprises modernize faster than any previous infrastructure shift. Applying that same mindset to AI is one of the most consequential architectural mistakes I see senior IT leaders make.

In executive rooms, the logic is understandable. We already have a hardened cloud platform. We have guardrails, FinOps processes, security controls and autoscaling policies. Why not onboard AI into the same architecture and move quickly? Because AI is not just another workload category. It is a different behavioral system. That distinction sounds subtle. In practice, it changes everything.

The assumption that worked for everything else
Traditional c
More in Open Source AI

Which model do you guys use for NSFW image generation?
I am new to this field and exploring the different models for generating NSFW images. What are your top models for that? Can I also generate NSFW videos? I am planning to self-host the model, so ideally I would want open-source model suggestions. How do you maintain consistency across characters? Do you use LoRA or some other technique? Just curious and keen to know what the community uses, in order to get things going for me.

submitted by /u/ElectricalVariety641

Peft 0.18.1 crashing when fine-tuning
Hi, peft version 0.18.1 is crashing when attempting to fine-tune google/gemma-4-E2B. The error message is shown below. I checked, and 0.18.1 is the latest version. Will there be an update soon, or is there a workaround? I'd appreciate any help. Thanks!

ValueError: Target module Gemma4ClippableLinear(
  (linear): Linear(in_features=768, out_features=768, bias=False)
) is not supported. Currently, only the following modules are supported: `torch.nn.Linear`, `torch.nn.Embedding`, `torch.nn.Conv1d`, `torch.nn.Conv2d`, `torch.nn.Conv3d`, `transformers.pytorch_utils.Conv1D`, `torch.nn.MultiheadAttention`.

1 post - 1 participant. Read full topic.
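Until peft supports the wrapper class directly, one common workaround is to point target_modules at the *inner* nn.Linear rather than the custom wrapper, since peft matches target_modules by module name. A sketch with a toy stand-in for the wrapper (ClippableLinear and ToyModel here are hypothetical, shaped like the module in the traceback):

```python
import torch.nn as nn

# Toy stand-in for the unsupported wrapper from the traceback (hypothetical).
class ClippableLinear(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim, bias=False)

    def forward(self, x):
        return self.linear(x).clamp(-10, 10)

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.q_proj = ClippableLinear(768)
        self.k_proj = ClippableLinear(768)

model = ToyModel()

# Collect fully qualified names of the inner nn.Linear modules; these names
# can then be passed to LoraConfig(target_modules=...) so peft wraps a
# supported module type instead of the custom wrapper.
linear_names = [
    name for name, mod in model.named_modules() if isinstance(mod, nn.Linear)
]
print(linear_names)  # ['q_proj.linear', 'k_proj.linear']
```

Whether this trains correctly for this particular model depends on what the wrapper does around the inner linear, so treat it as a stopgap rather than a fix.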
v4.3.1
Changes
- Gemma 4 support with full tool-calling in the API and UI.
- 🆕 ik_llama.cpp support: add ik_llama.cpp as a new backend through new textgen-portable-ik portable builds and a new --ik flag for full installs. ik_llama.cpp is a fork by the author of the imatrix quants, including support for new quant types, significantly more accurate KV cache quantization (via Hadamard KV cache rotation, enabled by default), and optimizations for MoE models and CPU inference.
- API: add echo + logprobs for /v1/completions. The completions endpoint now supports the echo and logprobs parameters, returning token-level log probabilities for both prompt and generated tokens. Token IDs are also included in the output via a new top_logprobs_ids field.
- Further optimize my custom gradio fork, saving up to 50 ms
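The new completions parameters can be exercised with a plain OpenAI-style request body; a sketch (the local address in the commented-out request is an assumption about the install, and the prompt is arbitrary):

```python
import json

# Request body for /v1/completions using the new echo + logprobs parameters.
payload = {
    "prompt": "The quick brown fox",
    "max_tokens": 8,
    "echo": True,   # return log probabilities for the prompt tokens too
    "logprobs": 5,  # top-5 log probabilities per token
}

body = json.dumps(payload)
print(body)

# To send it against a local server (address assumed):
# import urllib.request
# req = urllib.request.Request(
#     "http://127.0.0.1:5000/v1/completions",
#     data=body.encode(),
#     headers={"Content-Type": "application/json"},
# )
# resp = json.load(urllib.request.urlopen(req))
# Per the changelog, resp["choices"][0]["logprobs"] then carries token-level
# log probabilities, with token IDs in the new top_logprobs_ids field.
```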



