• simeon@reddthat.com
    1 day ago

    Ollama misrepresents which model you are actually running by falsely labeling the distills (Qwen or Llama fine-tunes trained on actual R1 output) as deepseek-r1. So you have probably only run the fine-tunes (unless you used the 671b model). These fine-tunes are more likely to rely on the training of their base models, which is why the Llama-based models (8b and 70b) could be giving you more liberal answers. In my experience running these models with llama.cpp, prompts like “What happened at Tiananmen Square” and “Is Taiwan a country?” lead to refusals (the model closes the think tags immediately and responds with some vague Chinese propaganda). Since you are using Ollama, the front end/UI you are using with it probably injects another token after the <think> token, which breaks the censorship.
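
    If you want to check this on your own outputs, here is a rough sketch of the two things described above: spotting a response where the think tags were closed immediately, and seeding the reasoning with extra text after <think>. The function names and the "Okay," seed are made up for illustration, and it assumes the raw response still contains both the opening and closing think tags (some front ends strip or hide them). It does not call Ollama or llama.cpp; it only works on strings you already have.

```python
import re

# Matches the first <think>...</think> pair, including newlines inside it.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def think_content(response: str):
    """Return the text inside the first <think>...</think> pair, or None if absent."""
    match = THINK_RE.search(response)
    return match.group(1) if match else None


def closed_think_immediately(response: str) -> bool:
    """True if a think block exists but contains only whitespace."""
    content = think_content(response)
    return content is not None and content.strip() == ""


def prefill_think(raw_prompt: str, seed: str = "Okay,") -> str:
    """Append <think> plus a seed string to an already-templated prompt.

    Hypothetical helper: `raw_prompt` is assumed to already end where the
    assistant turn begins. The seed stands in for whatever a front end
    might inject after the <think> token.
    """
    return f"{raw_prompt}<think>\n{seed}"
```

    Running closed_think_immediately over the raw completions from both setups should show whether the refusals really come with an empty think block, and prefill_think lets you test whether a single injected token changes that.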