  • Thanks for your answer. To be clear, what I’m looking for is a kind of masked fine-tuning: I want to “steer” a particular output rather than provide complete examples, which are costly to create.

    The steering would be something like this (a rough code sketch follows the list):

    1. I have an LLM generate a sequence.
    2. I find exactly where the LLM goes “off track” and correct it there (for only maybe 10-20 tokens instead of correcting the rest of the generation manually).
    3. The LLM continues “on track” until it goes off track again.

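    In code, the loop I have in mind would look roughly like this. This is only a sketch, assuming a Hugging Face causal LM: “gpt2” is a placeholder model, and human_review() is a hypothetical stand-in for the manual step, not a real function.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model
    tokenizer = AutoTokenizer.from_pretrained("gpt2")

    def human_review(text):
        # Hypothetical stand-in for the manual step: a person truncates the
        # transcript at the first off-track token and types a 10-20 token
        # correction. Returning (text, None) means "all on track, stop".
        return text, None

    transcript = "..."  # the prompt
    steers = []         # (position, correction) pairs to train on later
    while True:
        ids = tokenizer(transcript, return_tensors="pt").input_ids
        out = model.generate(ids, max_new_tokens=64)
        transcript += tokenizer.decode(out[0, ids.shape[1]:])
        ok_prefix, correction = human_review(transcript)
        if correction is None:
            break  # generation stayed on track to the end
        steers.append((len(ok_prefix), correction))
        transcript = ok_prefix + correction  # resume from the corrected prefix
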
    What I would like to do is train the model on these corrections, where many corrections might belong to the same overall generation. Conceptually, each correction should carry some training value. I don’t know much about masking, but what I mean is that I don’t want to train on a few tens or hundreds of (incomplete) samples, but rather on thousands of (masked) “steers”, each of which corrects the course of the rest of the sample’s generated text.
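
    Concretely, I imagine the training step looking something like this. Again a sketch assuming a Hugging Face causal LM; the model name, transcript, and span indices are placeholders:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

    # One training sample: the prompt plus the generation with the
    # human corrections ("steers") spliced in where it went off track.
    text = "..."  # hypothetical corrected transcript
    input_ids = tokenizer(text, return_tensors="pt")["input_ids"]

    # Start with every position masked out: -100 is ignored by the
    # cross-entropy loss in Hugging Face causal LMs.
    labels = torch.full_like(input_ids, -100)

    # Unmask only the 10-20 corrected tokens of each steer, so the
    # gradient comes from the corrections alone.
    steer_spans = [(42, 58), (103, 117)]  # hypothetical (start, end) token indices
    for start, end in steer_spans:
        labels[0, start:end] = input_ids[0, start:end]

    loss = model(input_ids=input_ids, labels=labels).loss
    loss.backward()
    optimizer.step()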

  • Can SFT be used on partial generations? What I mean by a “steer” is a correction to only a portion of the model’s output, and not even the end of it.

    For example, a “bad” partial output might be:

    <assistant> Here are four examples:
    1. High-quality example 1
    2. Low-quality example 2
    

    and the “steer” might be:

    <assistant> Here are four examples:
    1. High-quality example 1
    2. High-quality example 2
    

    but the full response will eventually be:

    <assistant> Here are four examples:
    1. High-quality example 1
    2. High-quality example 2
    3. High-quality example 3
    4. High-quality example 4
    

    The corrections don’t include the full output.
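
    So a single training sample would contain only the context plus the corrected span, with the loss masked everywhere except the steer. A sketch, assuming a Hugging Face causal LM (“gpt2” is a placeholder, and tokenizing the two pieces separately is a simplification, since the tokens at the seam can differ from tokenizing the whole string at once):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model
    tokenizer = AutoTokenizer.from_pretrained("gpt2")

    # Everything the model wrote up to (but not including) the correction.
    prefix = "<assistant> Here are four examples:\n1. High-quality example 1\n2. "
    # The steer: only the corrected span, nothing after it.
    steer = "High-quality example 2"

    prefix_ids = tokenizer(prefix, return_tensors="pt")["input_ids"]
    steer_ids = tokenizer(steer, return_tensors="pt")["input_ids"]
    input_ids = torch.cat([prefix_ids, steer_ids], dim=1)

    # Loss only on the steer tokens (-100 is ignored): the prefix is pure
    # context, and the rest of the response simply isn't in the sample.
    labels = torch.cat([torch.full_like(prefix_ids, -100), steer_ids], dim=1)

    loss = model(input_ids=input_ids, labels=labels).loss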